Leveraging Site Search Logs to Identify Missing Content on Enterprise Webpages
Authors
Abstract
Pearson’s correlation coefficient (r) between the vectors of counts and residual values over all tuples was found to be very close to zero (−0.035). The Kendall rank correlation coefficient τ between the ranked lists obtained by ordering the (w, q) tuples by frequency and by residual value was found to be −4.65 × 10⁻⁹. This indicates almost no correlation between counts and residuals.
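As a rough illustration of the two statistics mentioned in the abstract, the sketch below computes Pearson's r and a Kendall τ for a pair of numeric vectors using only the Python standard library. The function names, the choice of the tie-corrected τ-b variant, and the sample `counts` and `residuals` values are assumptions made for this example; the abstract does not say which τ variant or which data were used.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either sequence is constant.
    return cov / sqrt(var_x * var_y)


def kendall_tau_b(x, y):
    """Kendall tau-b (tie-corrected), simple O(n^2) pairwise version."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: drops out of both denominator terms
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom


# Hypothetical (w, q) tuple statistics, invented for illustration only.
counts = [5, 3, 8, 1, 2]
residuals = [0.2, -0.1, 0.05, 0.3, -0.4]

print("Pearson r:", pearson_r(counts, residuals))
print("Kendall tau-b:", kendall_tau_b(counts, residuals))
```

Values of both statistics near zero, as reported in the abstract, would suggest that a tuple's frequency says little about its residual. The pairwise loop is quadratic in the number of tuples, so a large log would need an O(n log n) implementation instead.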
Similar Resources
Data Preparation for Web Mining – A survey
An accepted trend is to categorize web mining into three main areas: web content mining, web structure mining and web usage mining. Web content mining involves extracting details/information from the contents of webpages and performing things like knowledge synthesis. Web structure mining involves the usage of graph theory to understand website structure/hierarchy. Web usage mining involves the...
A Novel Architecture for Detecting Phishing Webpages using Cost-based Feature Selection
Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features to determine whether it is a phishing webpage or not. Selecting appropriate features improves the performance of PWDS. Performance criteria are detection accuracy and system response time. The major time consumed by PWDS arises from feature extraction that ...
Long-Term Learning for Web Search Engines
This paper considers how web search engines can learn from the successful searches recorded in their user logs. Document Transformation is a feasible approach that uses these logs to improve document representations. Existing test collections do not allow an adequate investigation of Document Transformation, but we show how a rigorous evaluation of this method can be carried out using the refer...
Using the Results of CPTu to Identify the Subsurface Sediment Layers in Urmia Lake Bridge Site, NW Iran
Specifying the soil types and profiling the subsurface soil layers are the excellent examples of CPTu test potentials. In this research, the capability of CPTu test for specifying subsurface soil layers and classification of sediments in Urmia Lake is investigated. According to previous studies, the sediments of Urmia Lake are commonly fine grained and soft deposits with organic materials. To e...
Keyword Extraction for Webpage Clusters
The volume of unstructured information presented on the Internet is constantly increasing, together with the total amount of websites and their contents. To process this vast amount of information it is important to distinguish different clusters of related webpages. Such clusters are used, for example, for template induction, keyword extraction, and recommendation algorithms. A variety of appl...